Machine Translation Enhanced Automatic Speech Recognition

Authors

  • Matthias Paulik
  • Christian Fügen
Abstract

In human-mediated translation scenarios, a human interpreter translates between a source and a target language using either a spoken or a written representation of the source language. In this work, recognition performance on the speech of the human translator, spoken in the target language (English), is improved by taking advantage of the source language (Spanish) representations. To this end, machine translation techniques are used to translate between the source and target language resources, and the target language speech recognizer is then biased towards the knowledge gained, hence the name Machine Translation Enhanced Automatic Speech Recognition (MTE-ASR). Several basic MTE-ASR techniques are investigated, namely restricting the search vocabulary, selecting hypotheses from n-best lists, and applying cache and interpolation schemes to language modeling. Given a written representation of the source language, a non-iterative combination of the most successful basic techniques outperformed the English baseline ASR system by a relative word error rate reduction of 30.6%. In the case of a spoken source language representation, where a source language ASR system must first be used to create a written representation for further processing, the reduction is still 23.2%. An iterative system design, which recursively applies the improved ASR output to enhance the involved MT system(s) for a further ASR improvement, increased these word error rate reductions to 37.7% and 29.9%, respectively.
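
Two of the techniques named in the abstract, selecting hypotheses from n-best lists and applying cache and interpolation schemes to language modeling, can be illustrated with a small sketch. This is not the authors' system: the unigram cache model, the add-alpha smoothing, the uniform placeholder baseline model, the example sentences, and every function name and parameter below are assumptions chosen only to keep the example self-contained.

```python
import math
from collections import Counter


def build_cache_unigram(mt_hypotheses, alpha=1.0, vocab_size=10000):
    """Build a smoothed unigram 'cache' model from MT output (hypothetical helper).

    mt_hypotheses: target-language strings produced by translating the source text.
    alpha, vocab_size: assumed add-alpha smoothing settings.
    """
    counts = Counter(w for hyp in mt_hypotheses for w in hyp.lower().split())
    total = sum(counts.values())

    def prob(word):
        return (counts[word] + alpha) / (total + alpha * vocab_size)

    return prob


def interpolated_logprob(words, base_prob, cache_prob, lam=0.5):
    """Log-probability under a linear interpolation of baseline and cache models."""
    return sum(
        math.log((1.0 - lam) * base_prob(w) + lam * cache_prob(w)) for w in words
    )


def rescore_nbest(nbest, base_prob, cache_prob, lam=0.5, lm_weight=1.0):
    """Return the hypothesis text with the best combined score.

    nbest: list of (text, acoustic_log_score) pairs from the target-language ASR.
    """
    def combined(item):
        text, acoustic = item
        lm = interpolated_logprob(text.lower().split(), base_prob, cache_prob, lam)
        return acoustic + lm_weight * lm

    return max(nbest, key=combined)[0]


# Illustrative data only: a uniform placeholder stands in for a real baseline LM.
base_prob = lambda w: 1e-4
cache_prob = build_cache_unigram(["the committee approved the budget"])
nbest = [
    ("the committee improved the budget", -10.0),
    ("the committee approved the budget", -10.2),
]
best = rescore_nbest(nbest, base_prob, cache_prob)
print(best)
```

With `lam=0.0` the cache model is ignored and the acoustically preferred hypothesis wins; raising `lam` shifts the decision towards words that also appear in the MT output, which is the biasing effect the abstract describes in general terms.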

Similar resources

A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation

Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making a natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from speech signal becomes a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...

Automatic text dictation in computer-assisted translation

In this paper, we study the incorporation of statistical machine translation models to automatic speech recognition models in the framework of computer-assisted translation. The system is given a source language text to be translated and it shows the source text to the human translator to translate it orally. The system captures the user speech which is the dictation of the target language sent...

Document driven machine translation enhanced ASR

In human-mediated translation scenarios a human interpreter translates between a source and a target language using either a spoken or a written representation of the source language. In this paper we improve the recognition performance on the speech of the human translator spoken in the target language by taking advantage of the source language representations. We use machine translation techn...

Enhancements in Statistical Spoken Language Translation by De-normalization of ASR Results

Spoken language translation (SLT) has become very important in an increasingly globalized world. Machine translation (MT) for automatic speech recognition (ASR) systems is a major challenge of great interest. This research investigates automatic sentence segmentation of speech, which is important for enriching speech recognition output and for aiding downstream language processing. This arti...

MISTRAL: a Statistical Machine Translation Decoder for Speech Recognition Lattices

This paper presents MISTRAL, an open source statistical machine translation decoder dedicated to spoken language translation. While typical machine translation systems take a written text as input, MISTRAL translates word lattices produced by automatic speech recognition systems. The lattices are translated in two passes using a phrase-based model. Our experiments reveal an improvement in BLEU ...

Publication date: 2005